

Search for: All records

Creators/Authors contains: "Wu, Jie"


  1. Free, publicly-accessible full text available August 4, 2026
  2. In the past decade, topological data analysis has emerged as a powerful algebraic topology approach in data science. Although knot theory and related subjects are a focus of study in mathematics, their success in practical applications is quite limited due to the lack of localization and quantization. We address these challenges by introducing knot data analysis (KDA), a paradigm that incorporates curve segmentation and multiscale analysis into the Gauss link integral. The resulting multiscale Gauss link integral (mGLI) recovers the global topological properties of knots and links at an appropriate scale and offers a multiscale geometric topology approach to capture the local structures and connectivities in data. By integration with machine learning or deep learning, the proposed mGLI significantly outperforms other state-of-the-art methods across various benchmark problems in 13 intricately complex biological datasets, including protein flexibility analysis, protein–ligand interactions, human Ether-à-go-go-Related Gene potassium channel blockade screening, and quantitative toxicity assessment. Our KDA opens a research area—knot deep learning—in data science. 
  3. Although tropical intraseasonal variability (TISV), the most important source of predictability for subseasonal-to-seasonal (S2S) prediction, is dominated by the Madden-Julian oscillation (MJO), a significant fraction of it does not share the canonical MJO features, especially when the convective activity arrives at the Maritime Continent. In this study, using principal oscillation pattern (POP) analysis on the combined fields of daily equatorial convection and zonal wind, two distinct leading TISV modes with relatively slow e-folding decay rates are identified. One is an oscillatory mode with a period of 51 days and an e-folding time of 19 days, capturing the eastward-propagating (EP) feature of the canonical MJO. The other is a non-oscillatory damped mode with an e-folding time of 13.6 days, capturing a standing dipole (SD) with convection anomalies centered over the Maritime Continent and the tropical central Pacific, respectively. Compared to the EP mode, the leading low-level moisture anomalies to the east of the convection center are diminished for the SD mode; instead, strong negative moisture anomalies and subsiding motion emerge in the tropical central Pacific, which may be responsible for the distinct propagation features. Because no filtering is used, the time series of the two POPs can be applied to real-time monitoring of EP and SD events in a phase-space diagram. The two modes offer a simple and objective approach to characterizing the diverse nature of TISV beyond the canonical MJO description, which may shed further light on the dynamics of TISV and its predictability. 
  4. Interactions among the underlying agents of a complex system are not only limited to dyads but can also occur in larger groups. Currently, no generic model has been developed to capture high-order interactions (HOI), which, along with pairwise interactions, portray a detailed landscape of complex systems. Here, we integrate evolutionary game theory and behavioral ecology into a unified statistical mechanics framework, allowing all agents (modeled as nodes) and their bidirectional, signed, and weighted interactions at various orders (modeled as links or hyperlinks) to be coded into hypernetworks. Such hypernetworks can distinguish between how pairwise interactions modulate a third agent (active HOI) and how the altered state of each agent in turn governs interactions between other agents (passive HOI). The simultaneous occurrence of active and passive HOI can drive complex systems to evolve at multiple time and space scales. We apply the model to reconstruct a hypernetwork of hexa-species microbial communities, and by dissecting the topological architecture of the hypernetwork using GLMY homology theory, we find distinct roles of pairwise interactions and HOI in shaping community behavior and dynamics. The statistical relevance of the hypernetwork model is validated using a series of in vitro mono-, co-, and tricultural experiments based on three bacterial species. 
  5. The hydrodynamics of a self-propelling swimmer undergoing intermittent S-start swimming are investigated extensively with varying duty cycle $DC$, swimming period $T$, and tailbeat amplitude $A$. We find that the steady time-averaged swimming speed $\bar{U}_x$ increases directly with $A$, but varies inversely with $DC$ and $T$, where there is a maximal improvement of $541.29\,\%$ over continuous cruising swimming. Our results reveal two scaling laws, in the form of input versus output relations, that relate the swimmer's kinematics to its hydrodynamic performance: swimming speed and efficiency. A smaller $DC$ causes increased fluctuations in the swimmer's velocity generation. A larger $A$, on the other hand, allows the swimmer to reach steady swimming more quickly. Although we set out to determine scaling laws for intermittent S-start swimming, these scaling laws extend naturally to burst-and-coast and continuous modes of swimming. Additionally, we have identified, categorized and linked the wake structures produced by intermittent S-start swimmers with their velocity generation. 
  6. Sproul, Duncan (Ed.)
    Characterizing DNA methylation patterns is important for addressing key questions in evolutionary biology, development, geroscience, and medical genomics. While costs are decreasing, whole-genome DNA methylation profiling remains prohibitively expensive for most population-scale studies, creating a need for cost-effective, reduced representation approaches (i.e., assays that rely on microarrays, enzyme digests, or sequence capture to target a subset of the genome). Most common whole genome and reduced representation techniques rely on bisulfite conversion, which can damage DNA resulting in DNA loss and sequencing biases. Enzymatic methyl sequencing (EM-seq) was recently proposed to overcome these issues, but thorough benchmarking of EM-seq combined with cost-effective, reduced representation strategies is currently lacking. To address this gap, we optimized the Targeted Methylation Sequencing protocol (TMS), which profiles ~4 million CpG sites, for miniaturization, flexibility, and multispecies use. First, we tested modifications to increase throughput and reduce cost, including increasing multiplexing, decreasing DNA input, and using enzymatic rather than mechanical fragmentation to prepare DNA. Second, we compared our optimized TMS protocol to commonly used techniques, specifically the Infinium MethylationEPIC BeadChip (n = 55 paired samples) and whole genome bisulfite sequencing (n = 6 paired samples). In both cases, we found strong agreement between technologies (R² = 0.97 and 0.99, respectively). Third, we tested the optimized TMS protocol in three non-human primate species (rhesus macaques, geladas, and capuchins). We captured a high percentage (mean = 77.1%) of targeted CpG sites and produced methylation level estimates that agreed with those generated from reduced representation bisulfite sequencing (R² = 0.98). Finally, we confirmed that estimates of 1) epigenetic age and 2) tissue-specific DNA methylation patterns are strongly recapitulated using data generated from TMS versus other technologies. Altogether, our optimized TMS protocol will enable cost-effective, population-scale studies of genome-wide DNA methylation levels across human and non-human primate species. 
    Free, publicly-accessible full text available May 22, 2026
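
As an illustration of the Gauss link integral that the knot data analysis abstract above builds on, the following is a minimal sketch of the classical (single-scale) integral evaluated numerically for two closed curves. It is not the multiscale mGLI of that work: the curve segmentation and multiscale steps are omitted, and the function names, the midpoint discretization, and the Hopf-link test curves are assumptions chosen for this example.

```python
import numpy as np


def gauss_linking_integral(curve_a, curve_b):
    """Approximate the Gauss link integral of two closed polylines.

    Each curve is an (n, 3) array of vertices; the last vertex is joined
    back to the first. Every segment is represented by its midpoint and
    its edge vector (a midpoint-rule discretization, assumed here).
    """
    a = np.asarray(curve_a, dtype=float)
    b = np.asarray(curve_b, dtype=float)
    da = np.roll(a, -1, axis=0) - a          # edge vectors of curve A
    db = np.roll(b, -1, axis=0) - b          # edge vectors of curve B
    mid_a = a + 0.5 * da                     # segment midpoints
    mid_b = b + 0.5 * db
    diff = mid_a[:, None, :] - mid_b[None, :, :]        # shape (n, m, 3)
    cross = np.cross(da[:, None, :], db[None, :, :])    # shape (n, m, 3)
    dist_cubed = np.linalg.norm(diff, axis=2) ** 3
    integrand = np.sum(cross * diff, axis=2) / dist_cubed
    return float(integrand.sum() / (4.0 * np.pi))


def circle(center, axis_u, axis_v, n=200):
    """Hypothetical helper: a unit circle spanned by axis_u and axis_v."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return (np.asarray(center, dtype=float)
            + np.outer(np.cos(t), axis_u)
            + np.outer(np.sin(t), axis_v))


# Two interlocked circles (a Hopf link) and two well-separated circles.
ring_xy = circle([0, 0, 0], [1, 0, 0], [0, 1, 0])
ring_xz_linked = circle([1, 0, 0], [1, 0, 0], [0, 0, 1])
ring_xz_far = circle([5, 0, 0], [1, 0, 0], [0, 0, 1])

lk_linked = gauss_linking_integral(ring_xy, ring_xz_linked)
lk_unlinked = gauss_linking_integral(ring_xy, ring_xz_far)
print(f"linked pair: {lk_linked:.4f}, separated pair: {lk_unlinked:.4f}")
```

For the interlocked pair the value should have magnitude close to 1 (the sign depends on the orientation of the curves), and for the separated pair it should be close to 0.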
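
The POP analysis in the TISV abstract above reports each mode by a period and an e-folding time. A minimal sketch of how such numbers can be read off a lag-1 linear model is given below, assuming the standard POP formulation in which the system matrix is estimated as the lag-1 covariance times the inverse lag-0 covariance. The function names and the synthetic two-variable test series are assumptions for illustration; the abstract does not specify the authors' implementation or preprocessing.

```python
import numpy as np


def pop_modes(anomalies, dt=1.0):
    """Estimate POP eigenvalues, patterns, e-folding times, and periods.

    `anomalies` is a (time, variable) array that is assumed to have its
    time mean already removed; `dt` is the sampling interval (e.g. 1 day).
    """
    x = np.asarray(anomalies, dtype=float)
    x0, x1 = x[:-1], x[1:]
    c0 = x0.T @ x0                      # lag-0 covariance (unnormalized)
    c1 = x1.T @ x0                      # lag-1 covariance (unnormalized)
    system = c1 @ np.linalg.pinv(c0)    # x(t + dt) ~ system @ x(t)
    eigvals, patterns = np.linalg.eig(system)

    # |lambda| sets the decay: |lambda| = exp(-dt / efold).
    efold = -dt / np.log(np.abs(eigvals))
    # arg(lambda) sets the rotation; real positive eigenvalues do not oscillate.
    angle = np.abs(np.angle(eigvals))
    period = np.full(eigvals.shape, np.inf)
    oscillatory = angle > 1e-12
    period[oscillatory] = 2.0 * np.pi * dt / angle[oscillatory]
    return eigvals, patterns, efold, period


def synthetic_oscillation(period=51.0, efold=19.0, n_steps=100):
    """Hypothetical noise-free test series: a damped rotation in 2-D."""
    rho = np.exp(-1.0 / efold)
    omega = 2.0 * np.pi / period
    step = rho * np.array([[np.cos(omega), -np.sin(omega)],
                           [np.sin(omega), np.cos(omega)]])
    x = np.empty((n_steps, 2))
    x[0] = [1.0, 0.0]
    for t in range(1, n_steps):
        x[t] = step @ x[t - 1]
    return x


_, _, efold_est, period_est = pop_modes(synthetic_oscillation())
print("e-folding times:", efold_est, "periods:", period_est)
```

A complex-conjugate eigenvalue pair corresponds to an oscillatory mode such as the EP mode, while a real positive eigenvalue gives an infinite period, that is, a purely damped mode such as the SD mode. Real data would also contain noise, so the recovered timescales would only be estimates.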